Computer Methods and Programs in Biomedicine — Latest Matching Preprints

1

MCA-UNet: A Multi-Scale Context and Attention U-Net for Colorectal Polyp Segmentation

Dong, Y.; Fang, G.; Du, R.; Hu, H.; Fang, Z.; Guo, C.; Lu, R.; Jia, Y.; Tian, Y.; Wang, Z.

2026-03-16 gastroenterology 10.64898/2026.03.11.26348049 medRxiv

Top 0.1%

14.3%

Introduction: To propose an improved U-Net-based segmentation model for colorectal polyp segmentation, aiming to address the challenges of variable lesion morphology, ambiguous boundaries, complex background interference, and insufficient cross-level feature fusion in endoscopic images [5,12]. Methods: An improved network termed MCA-UNet was developed based on U-Net [5]. The model incorporates a multi-scale context convolution block (MCCB) to enhance multi-scale feature extraction and an attention-guided feature fusion module (AGFF) to optimize skip-feature selection and fusion in the decoder. Experiments were conducted on publicly available colorectal polyp image datasets, including Kvasir-SEG and CVC-ClinicDB [13-15]. Four models (U-Net, U-Net+MCCB, U-Net+AGFF, and MCA-UNet) were compared, and all models were trained for 100 epochs. Dice, intersection over union (IoU), and mean absolute error (MAE) were used as the main evaluation metrics [20]. Results: On the mixed validation set, the Dice scores of U-Net, U-Net+MCCB, U-Net+AGFF, and MCA-UNet were 0.742, 0.771, 0.754, and 0.783, respectively; the corresponding IoU values were 0.603, 0.635, 0.618, and 0.649; and the MAE values were 0.102, 0.090, 0.097, and 0.086. Compared with the baseline U-Net, MCA-UNet improved Dice and IoU by 5.53% and 7.63%, respectively, while reducing MAE by 15.69%. Comparisons on the Kvasir-SEG and CVC-ClinicDB validation subsets further demonstrated the more stable performance of the proposed model. Conclusion: By jointly integrating multi-scale contextual modeling and attention-guided feature fusion, MCA-UNet effectively improves the accuracy and robustness of colorectal polyp segmentation and may provide useful support for intelligent endoscopic image analysis [12,17,18].
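
The three metrics named in this abstract, and the relative improvements it reports, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the 0.5 threshold, and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def dice_iou_mae(
    pred: Sequence[float], target: Sequence[int], threshold: float = 0.5
) -> Tuple[float, float, float]:
    """Return (Dice, IoU, MAE) for a flattened probability map and a binary mask.

    The threshold and the empty-mask convention are illustrative assumptions.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and of equal length")
    binary = [1 if p >= threshold else 0 for p in pred]
    intersection = sum(b * t for b, t in zip(binary, target))
    pred_area = sum(binary)
    target_area = sum(target)
    union = pred_area + target_area - intersection
    # Assumed convention: two empty masks count as a perfect match.
    dice = 2 * intersection / (pred_area + target_area) if union else 1.0
    iou = intersection / union if union else 1.0
    # MAE is computed on the raw probabilities, not the thresholded mask.
    mae = sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)
    return dice, iou, mae


def relative_change_percent(baseline: float, new: float) -> float:
    """Relative change of `new` with respect to `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Scores reported in the abstract for U-Net (baseline) and MCA-UNet.
    print(round(relative_change_percent(0.742, 0.783), 2))  # Dice
    print(round(relative_change_percent(0.603, 0.649), 2))  # IoU
    print(round(relative_change_percent(0.102, 0.086), 2))  # MAE (negative = reduction)
```

Applied to the reported scores, `relative_change_percent` reproduces the 5.53%, 7.63%, and 15.69% figures quoted in the abstract, which confirms they are relative rather than absolute differences.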

2

Deep Learning-Based Missing Value Imputation for Heart Failure Data from MIMIC-III: A Comparative Study of DAE, SAITS, and MICE+LightGBM

Sharma, S.; Kaur, M.; Gupta, S.

2026-02-11 health systems and quality improvement 10.64898/2026.02.10.26345979 medRxiv

Top 0.1%

6.9%

Background: Electronic Health Records (EHR) are crucial for Clinical Decision Support Systems and for delivering proper care to ICU heart failure patients. However, data are often missing due to monitoring device errors, hence the need for robust imputation methodologies. Objective: To compare and evaluate three methodologies for imputing missing data for heart failure patients from the MIMIC-III database: Denoising Autoencoder (DAE), Self-Attention Imputation for Time Series (SAITS), and Multiple Imputation by Chained Equations (MICE) with LightGBM. Methods: An analysis of 14,090 ICU admissions for patients with heart failure was performed using data from the MIMIC-III database. Features were selected based on clinical relevance, and 19 clinical features were selected through a combination of Random Forest analysis, correlation analysis, and Mutual Information. Artificial missing values were introduced into the data set at rates of 20%, 30%, and 50%, and the three imputation methodologies (DAE, SAITS, and MICE+LightGBM) were then evaluated. The performance of each imputation methodology was evaluated using Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Normalized Root Mean Square Error (NRMSE). Results: Both DAE and SAITS had superior performance on the imputation of missing values across all percentages of missing values. At 20% missingness, DAE had mean MAE = 0.004967, RMSE = 0.005217, and NRMSE = 3.260893, while SAITS had mean MAE = 0.005461, RMSE = 0.005797, and NRMSE = 3.244695; MICE+LightGBM produced higher errors. At 50% missingness, SAITS demonstrated the best performance, followed by DAE, while MICE+LightGBM showed decreased performance. The deep learning methodologies maintained a consistent level of accuracy across the clinical variables measured. Conclusions: Our analysis indicates that deep learning-based imputation methodologies significantly outperform traditional methodologies for imputing missing values in ICU heart failure data, thus supporting the implementation of these methodologies into Clinical Decision Support Systems for heart failure patients.
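
The evaluation protocol described here (mask a known fraction of values, impute, score only the masked positions) can be sketched as follows. This is an illustrative sketch, not the study's pipeline: `mask_values` and `imputation_errors` are hypothetical helpers, and NRMSE is normalized by the range of the true values, which is an assumption because the abstract does not state the normalization used.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple


def mask_values(
    values: Sequence[float], missing_rate: float, seed: int = 0
) -> Tuple[List[Optional[float]], List[int]]:
    """Hide a random fraction of values; return the masked copy and hidden indices."""
    rng = random.Random(seed)
    n_missing = round(len(values) * missing_rate)
    masked_idx = sorted(rng.sample(range(len(values)), n_missing))
    hidden = set(masked_idx)
    observed = [None if i in hidden else v for i, v in enumerate(values)]
    return observed, masked_idx


def imputation_errors(
    truth: Sequence[float], imputed: Sequence[float], masked_idx: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (MAE, RMSE, NRMSE) computed only on the artificially masked positions."""
    if not masked_idx:
        raise ValueError("no masked positions to evaluate")
    errors = [imputed[i] - truth[i] for i in masked_idx]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # Assumption: normalize by the range of the true values.
    value_range = max(truth) - min(truth)
    nrmse = rmse / value_range if value_range else float("nan")
    return mae, rmse, nrmse


if __name__ == "__main__":
    truth = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
    observed, hidden = mask_values(truth, missing_rate=0.2)
    # Trivial stand-in imputer: fill each gap with the mean of the observed values.
    known = [v for v in observed if v is not None]
    fill = sum(known) / len(known)
    imputed = [fill if v is None else v for v in observed]
    print(imputation_errors(truth, imputed, hidden))
```

Any imputer (DAE, SAITS, MICE+LightGBM) would slot in where the mean-fill stand-in is used; scoring only the hidden indices keeps observed values from diluting the error.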

3

A clinicoradiological model for preoperative prediction of lateral lymph node metastasis in rectal cancer

Shen, Q.; Wang, G.; Fu, M.; Yao, K.; Yang, Y.; Zeng, Q.; Guo, Y.

2026-04-15 gastroenterology 10.64898/2026.04.13.26350816 medRxiv

Top 0.1%

6.5%

Background: Lateral lymph node metastasis (LLNM) is associated with poor prognosis in patients with rectal cancer and may influence the indication for lateral lymph node dissection (LLND). Accurate preoperative identification of LLNM remains challenging. This study aimed to develop and internally validate a clinicoradiological model for preoperative prediction of LLNM in rectal cancer. Methods: A retrospective cohort of 64 patients undergoing LLND for rectal cancer was analysed; 21 (32.8%) had pathological LLNM. A prespecified preoperative clinicoradiological model was fitted using penalised logistic regression with L2 regularisation (ridge), incorporating MRI-measured lateral lymph node short-axis diameter (LLN-SAD), dichotomised clinical T stage (T3-4 vs T1-2), dichotomised clinical N stage (N+ vs N0), and log(CA19-9+1). Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), calibration analysis, and bootstrap internal validation. Results: The model showed good discrimination (AUC 0.914), with an optimism-corrected AUC of 0.887 on bootstrap validation. Calibration remained acceptable after optimism correction (calibration intercept -0.127; slope 1.045). Decision curve analysis suggested net benefit across clinically relevant threshold probabilities, particularly between 0.10 and 0.30. The model was implemented as a web-based calculator to facilitate clinical use. Conclusion: This clinicoradiological model showed good discrimination, acceptable calibration, and potential clinical utility for preoperative assessment of LLNM risk in rectal cancer. It may assist individualized risk stratification and treatment planning, although external validation is required before routine clinical implementation.

4

Vision Transformers Based AI Models For Predicting Colorectal Cancer from Digital Pathology WSI: Use Case Of MHIST dataset

Kondejkar, T.; Tunik, G.; Amal, S.

2026-02-04 gastroenterology 10.64898/2026.02.03.26345516 medRxiv

Top 0.1%

6.4%

This study investigates the efficacy of transformer-based deep learning architectures--specifically, Vision Transformer (ViT), Class Attention in Image Transformers (CaiT), and Data-Efficient Image Transformers (DeiT)--for the binary classification of colorectal polyps using the Minimalist Histopathology Image Analysis Dataset (MHIST). The dataset comprises 3,152 hematoxylin and eosin (H&E)-stained Formalin Fixed Paraffin-Embedded (FFPE) images annotated as either Hyperplastic Polyps (HP) or Sessile Serrated Adenomas (SSA). A rigorous evaluation was conducted using a 5-fold stratified cross-validation methodology, and performance was quantified using metrics including accuracy, precision, recall, F1-score, and AUC-ROC. Experimental results revealed that transformer architectures, particularly CaiT (accuracy of 90.18%, AUC-ROC of 95.52%), outperformed traditional convolutional neural networks (CNNs). The superior performance of CaiT is attributed to its specialized class-attention mechanisms, effectively capturing nuanced morphological differences essential for accurate histopathological classification. These findings underscore the potential of transformer-based models to enhance diagnostic precision, reduce variability in pathological assessment, and facilitate earlier and more reliable colorectal cancer screening.

5

DIA-PINN. A physics-informed machine learning method to estimate global intrinsic diastolic chamber properties of the left ventricle from pressure-volume data

Fernandez Topham, J.; Guerrero Hurtado, M.; del Alamo, J. C.; Bermejo, J.; Martinez Legazpi, P.

2026-03-06 cardiovascular medicine 10.64898/2026.03.02.26347245 medRxiv

Top 0.1%

6.2%

Background: Pressure-volume (PV) loop analysis remains the gold standard for assessing the intrinsic global diastolic properties of the left ventricle (LV). Traditional fitting techniques rely on local, phase-constrained fittings and are limited by their sensitivity to noise, landmark selection, violation of assumptions, and non-convergence. Objective: To develop and validate DIA-PINN, a physics-informed neural network (PINN) framework capable of calculating intrinsic diastolic properties of the LV from measured instantaneous PV data, combining mechanistic interpretability with machine learning flexibility. Methods: Instantaneous LV diastolic pressure was modeled as the sum of (1) time-dependent relaxation-related pressure and (2) volume-dependent recoil and stiffness-related pressures. DIA-PINN was trained using time, LV pressure, and volume as inputs, enforcing data fidelity, model consistency, and physiological plausibility within the loss function. Performance was evaluated in 4,000 Monte Carlo simulations of LV PV loops and in clinical data from 59 patients who underwent catheterization (39 with heart failure and normal ejection fraction and 20 controls). DIA-PINN-derived indices were compared to those obtained from a previously validated global optimization method (GOM). Results: On the simulation data, DIA-PINN accurately recovered all constitutive indices (intraclass correlation coefficients near unity) and improved on GOM performance. On the clinical data, diastolic indices derived using DIA-PINN strongly correlated with GOM estimates (R>0.90, p<0.001) but were insensitive to initialization. DIA-PINN performed best under vena cava occlusion, as varying preload improved parameter identifiability. Conclusions: When applied to instantaneous pressure-volume data, a generalizable PINN framework, DIA-PINN, provides an improved method for assessing global intrinsic diastolic properties of cardiac chambers.
New & Noteworthy: Our work introduces DIA-PINN, a physics-informed neural network framework to process instantaneous ventricular pressure-volume data, solving a mechanistic model of diastole with machine learning techniques. Compared to current conventional or optimization-based approaches, the PINN provides the most reliable estimates of diastolic stiffness, relaxation, and elastic recoil, insensitive to initialization. By embedding physiological constraints into network training, this approach achieves robust, interpretable, and clinically applicable quantification of gold-standard metrics of intrinsic global diastolic chamber properties.

6

Wavelet-Domain Multi-Representation and Ensemble Learning for Automated ECG Analysis

Chato, L.; Kagozi, A.

2026-02-17 bioengineering 10.64898/2026.02.14.705908 medRxiv

Top 0.1%

6.2%

Accurate diagnosis of cardiac abnormalities from electrocardiogram signals remains a central challenge in automated cardiovascular assessment. This study investigates the efficiency of time-frequency representations and deep learning architectures in classifying 12-lead ECGs into five diagnostic super-classes using the PTB-XL dataset. Continuous Wavelet Transform is applied to generate two time-frequency representations, scalograms and phasograms, representing spectral energy and phase distributions, respectively. We experiment with both early and late information fusion strategies using several convolutional and transformer-based networks, including a custom Convolutional Neural Network, hybrid deep learning, transfer learning, feature fusion, ensemble modeling, and weighted loss strategies. An ensemble fusion of models trained on time-frequency and time representations achieved the best overall performance, with an Area Under Curve of 0.9233, surpassing individual modalities. To improve the results further, weighted focal loss is used to raise the low classification rates that imbalanced data cause for some labels. The results highlight the potential of multi-representation wavelet fusion for interpretable and generalizable ECG classification.

7

Physics-Based Growth and Remodeling Modeling for Virtual Abdominal Aortic Aneurysm Evolution and Growth Prediction

Jahani, F.; Jiang, Z.; Nabaei, M.; Baek, S.

2026-03-03 cardiovascular medicine 10.64898/2026.02.26.26347026 medRxiv

Top 0.1%

6.2%

Computational growth and remodeling (G&R) models have been extensively used to investigate abdominal aortic aneurysm (AAA) progression and to support clinical decision-making. However, the development of robust predictive models is often limited by the scarcity of large-scale longitudinal imaging datasets. In this study, we propose a physics-based G&R framework to simulate AAA shape evolution and generate a virtual cohort of aneurysms, thereby addressing data limitations and enabling integration with data-driven machine learning approaches for growth prediction. The proposed arterial G&R model incorporates key mechanisms influencing aneurysm progression, including elastin degradation and stress-mediated collagen production. A modified elastin degradation formulation was introduced to generate realistic aneurysm geometries exhibiting clinically relevant features such as asymmetry and tortuosity. By systematically varying parameters governing elastin damage and collagen production, 200 distinct G&R simulations were performed to produce a diverse set of AAA geometries. The dataset was further expanded using kriging-based spatial interpolation to construct a large in silico cohort. The synthetic dataset, combined with longitudinal imaging data from 25 patients, was used to train and validate four machine learning models: Deep Belief Network (DBN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). A two-step training strategy was adopted to predict maximum aneurysm diameter and growth rate based on prior geometric characteristics. The LSTM model achieved the highest performance for maximum diameter prediction (R² = 0.92), while the RNN demonstrated strong overall performance (R² = 0.90 for maximum diameter and 0.89 for growth rate). The DBN and GRU models also showed competitive predictive capability.
Overall, this study demonstrates that integrating physics-based G&R simulations with machine learning enables accurate prediction of AAA growth and maximum diameter. The proposed framework provides a scalable strategy for augmenting limited clinical datasets and offers a promising tool to support personalized risk assessment and treatment planning.

8

QRS Detection by Combinatorial Optimization With MLP Assisted Peak Scoring

Hopenfeld, B.

2026-04-22 bioengineering 10.64898/2026.04.19.719501 medRxiv

Top 0.1%

5.0%

A multiple channel QRS detector is described. The detector partitions raw signal segments into peak domains, extracts parameters associated with the peak domains, and scores peaks based on these parameters. A multi-layer perceptron (MLP) with 11 inputs generates provisional peak scores, which are refined through application of rules involving 20-30 parameters. An optimal sequence of supra threshold peaks is determined. Separately, combinatorial optimization determines an optimal structured heart rhythm sequence. Adjudication between the general supra threshold sequence and the structured sequence depends on noise level, peak quality, and rhythm structure quality. For multiple channel fusion, peak scores are determined as a noise weighted function of channel peak scores. The MLP was trained on approximately 70% of channel 1 of the MIT-BIH Arrhythmia Database. The supplementary rules were heuristically chosen over all channel 1 records. Sensitivity (SE) and positive predictive value (PPV) of the detector applied to channel 2 were a function of the noise threshold used to discard segments. At a noise level that would exclude 2.2% of channel 1 data, the SE and PPV were 99.67% and 99.75% respectively. Importantly, even in high noise, the detector was able to track large scale features of heart rhythm. Fused channel 1 and channel 2 SE and PPV were 99.96% and 99.98% respectively. The present algorithm points the way toward maximal extraction of heart rhythm information from noisy signals, and the potential to reduce false alarms generated by automated rhythm analysis software.

9

MOE-ECG: Multi-Objective Ensemble Fusion for Robust Atrial Fibrillation Detection Using Electrocardiograms

Peimankar, A.; Hossein Motlagh, N.; K. Khare, S.; Spicher, N.; Dominguez, H.; Abolghasemi, V.; Fujiwara, K.; Teichmann, D.; Rahmani, R.; Puthusserypady, S.

2026-03-30 health informatics 10.64898/2026.03.28.26349522 medRxiv

Top 0.1%

4.9%

Background: Atrial fibrillation (AFib) is the most common sustained arrhythmia in the world, imposing a heavy clinical and economic burden on global healthcare systems. Early detection of AFib can reduce mortality and morbidity, while helping to alleviate the growing economic burden of cardiovascular diseases. With the increasing availability of digital health technologies, computational solutions have great potential to support the timely diagnosis of cardiac abnormalities. Objectives: With the increasing availability of electrocardiogram (ECG) data from clinical and wearable devices, manual interpretation has become impractical due to its time-consuming and subjective nature. Existing automated approaches often rely on single classifiers or fixed ensembles that primarily optimize predictive accuracy while neglecting model diversity, which leads to limited robustness and generalization across heterogeneous datasets. Therefore, this study aims to develop a robust and diversity-aware framework for automatic AFib detection that simultaneously improves classification performance and model generalizability. To this end, we propose MOE-ECG, a multi-objective ensemble selection and fusion framework that explicitly optimizes both predictive performance and inter-model diversity for reliable AFib detection from ECG recordings. Methods: The proposed multi-objective ensemble (MOE) framework formulates ensemble selection as a bi-objective optimization problem and employs multi-objective particle swarm optimization to identify complementary classifiers from a heterogeneous model pool. Unlike conventional ensembles, it explicitly optimizes both predictive performance and diversity and integrates Dempster-Shafer theory for uncertainty-aware decision fusion. The ECG signals were filtered to remove baseline wander and noise and then segmented into windows of 20, 60, and 120 heartbeats with 50% overlap.
The proposed approach was evaluated over five independent runs to assess its stability and generalization. Fifteen statistical and nonlinear features were obtained from the RR-intervals of the pre-processed ECG signals, of which eight features were selected with correlation analysis to capture subtle information from the ECG data. We trained and evaluated the performance of the proposed model in three open source databases, namely, the MIT-BIH Atrial Fibrillation Database, Saitama Heart Database Atrial Fibrillation, and Long-Term AF Database. Results: The proposed approach achieved the best overall performance on 60-beat segments, with an average accuracy of 89.85%, precision of 91.14%, recall of 94.19%, an F1-score of 92.64%, and area under the curve (AUC) of around 0.95. Statistical analysis using Holm-adjusted Wilcoxon tests confirmed significant improvements (p<0.05) compared to both the best individual classifier and the unoptimized average ensemble of all classifiers. These findings show that the proposed selection and evaluation methodology, rather than group aggregation alone, is the key driver of performance improvements. Conclusion: The results obtained demonstrate that the MOE-ECG model offers a robust, accurate, and reliable solution for the detection of AFib from short ECG segments. The empirical findings, in general, confirm that multi-objective ensemble fusion enhances diagnostic performance and offers robust predictions that will open up possibilities for real-time AFib detection in clinical and tele-health settings.
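
Two steps mentioned in this abstract lend themselves to a short illustration: segmenting a beat sequence into fixed-length windows with 50% overlap, and fusing two classifier outputs with Dempster's rule of combination. The sketch below is illustrative only and is not the MOE-ECG implementation. The two-class frame (AFib, non-AFib, plus an "uncertain" mass on the whole frame), the function names, and the choice to drop a trailing incomplete window are all assumptions.

```python
from typing import List, Sequence, Tuple

# Mass assignment over the frame {AFib, non-AFib}: (m_afib, m_non_afib, m_uncertain).
Mass = Tuple[float, float, float]


def overlapping_windows(beats: Sequence[float], window_size: int) -> List[List[float]]:
    """Split a beat sequence into windows of `window_size` beats with 50% overlap.

    Assumption: a trailing window shorter than `window_size` is dropped.
    """
    if window_size < 2:
        raise ValueError("window_size must be at least 2")
    step = window_size // 2
    last_start = len(beats) - window_size
    return [list(beats[s:s + window_size]) for s in range(0, last_start + 1, step)]


def dempster_combine(m1: Mass, m2: Mass) -> Mass:
    """Combine two mass assignments with Dempster's rule on a two-class frame."""
    a1, n1, u1 = m1
    a2, n2, u2 = m2
    # Conflict: one source supports AFib while the other supports non-AFib.
    conflict = a1 * n2 + n1 * a2
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    norm = 1.0 - conflict
    afib = (a1 * a2 + a1 * u2 + u1 * a2) / norm
    non_afib = (n1 * n2 + n1 * u2 + u1 * n2) / norm
    uncertain = (u1 * u2) / norm
    return afib, non_afib, uncertain


if __name__ == "__main__":
    rr_intervals = [0.80, 0.82, 0.79, 0.95, 0.60, 0.88, 0.81, 0.83, 0.78, 0.90]
    for window in overlapping_windows(rr_intervals, window_size=4):
        print(window)
    print(dempster_combine((0.6, 0.1, 0.3), (0.5, 0.2, 0.3)))
```

Keeping an explicit "uncertain" mass is what distinguishes this kind of fusion from averaging probabilities: a classifier that is unsure contributes less to the combined belief instead of pulling it toward 0.5.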

10

Quality versus quantity of training datasets for artificial intelligence-based whole liver segmentation

Castelo, A.; O'Connor, C.; Gupta, A. C.; Anderson, B. M.; Woodland, M.; Altaie, M.; Koay, E. J.; Odisio, B. C.; Tang, T. T.; Brock, K. K.

2026-02-18 radiology and imaging 10.64898/2026.02.17.26346486 medRxiv

Top 0.1%

4.9%

Artificial intelligence (AI)-based segmentation has many medical applications, but limited curated datasets challenge model training; this study compares the impact of dataset annotation quality and quantity on whole-liver AI segmentation performance. We obtained 3,089 abdominal computed tomography scans with whole-liver contours from MD Anderson Cancer Center (MDA) and a MICCAI challenge. A total of 249 scans were withheld for testing, of which 30 (MICCAI challenge data) were reserved for external validation. The remaining scans were divided into mixed-curation and highly curated groups, randomly sampled into sub-datasets of various sizes, and used to train 3D nnU-Net segmentation models. Dice similarity coefficients (DSC), surface DSC with 2mm margins (SD 2mm), the 95th percentile of Hausdorff distance (HD95), and 2D axial slice DSC (Slice DSC) were used to evaluate model performance. The highly curated, 244-scan model (DSC=0.971, SD 2mm=0.958, HD95=2.98mm) did not differ significantly on 3D evaluation metrics from the mixed-curation, 2,840-scan model (DSC=0.971 [p>.999], SD 2mm=0.958 [p>.999], HD95=2.87mm [p>.999]). The mixed-curation, 710-scan model (Slice DSC=0.929) significantly outperformed the highly curated, 244-scan model (Slice DSC=0.923 [p=0.012]) on the 30 external scans. Highly curated datasets yielded equivalent performance to datasets that were a full order of magnitude larger. The benefits of larger, mixed-curation datasets are evidenced in model generalizability metrics and local improvements. In conclusion, tradeoffs between dataset quality and quantity for model training are nuanced and goal dependent.

11

AI and Hierarchical clustering techniques for accurate patient stratification

Diaz Ochoa, J. G.; Puskaric, M.; Layer, N.; Jensch, A.; Knott, M.; Krohn, A.

2026-03-15 health informatics 10.64898/2026.03.13.26348331 medRxiv

Top 0.1%

4.8%

Graph-based methods for data representation and analysis are well suited for encoding both data points and their interrelationships. This approach integrates data and topology, enabling the representation of interrelated information. In this study, we represent patient cohorts as cohort graphs (CGs) and discuss their application to real-world patient data. We particularly focus on developing methods to cluster patients with similar symptoms and examine how bias parameters (such as sex and age group) influence interlinking within CGs, thereby improving results for accurate patient stratification and personalized decision-making in a clinical context. In particular, we illustrate how considering sex and age groups can improve the symptom clustering of a patient population with lung and gastrointestinal cancer. Finally, we discuss the essential role of high-performance computing (HPC) in upscaling analytical methods for CGs.

12

Ability to Detect Changes and Minimal Important Difference of Real-World Digital Mobility Outcomes in Proximal Femoral Fracture Patients

Jansen, C.-P.; Braun, J.; Alvarez, P.; Berge, M. A.; Blain, H.; Buekers, J.; Caulfield, B.; Cereatti, A.; Del Din, S.; Garcia-Aymerich, J.; Helbostad, J. L.; Klenk, J.; Koch, S.; Murauer, E.; Polhemus, A.; Rochester, L.; Vereijken, B.; Puhan, M. A.; Becker, C.; Frei, A.

2026-03-06 geriatric medicine 10.64898/2026.03.06.26347770 medRxiv

Top 0.1%

4.4%

Background: Older adults' walking has so far been evaluated using standardised assessments of walking capacity within a clinical setting. By taking the evaluation out of the laboratory into the real world, this study provides first evidence of the ability of Digital Mobility Outcomes (DMOs) to detect changes over time and of the Minimal Important Difference (MID) in patients after proximal femoral fracture (PFF). This will guide the implementation of DMOs in research and clinical care. Methods: For this multicenter prospective cohort study, 381 community-dwelling older adults were included within one year after sustaining a PFF and assessed at two time points, separated by six months. Walking activity and gait DMOs were measured using a single wearable device worn on the lower back for up to seven days. A global impression of change question and three mobility-related outcome measures (Late-Life Function and Disability Instrument; Short Physical Performance Battery; 4m gait speed) were used as anchor variables. To assess each DMO's ability to detect changes, we calculated the standardized mean change as effect size. For estimating MIDs, both distribution-based and anchor-based methods were applied, followed by triangulation by experts if at least three anchor-based estimates were available per DMO, resulting in single-point estimates. Results: All three anchor variables demonstrated substantial changes. Overall, 10 out of 24 available DMOs showed large and 7 DMOs moderate positive effects in the expected direction of the respective anchors. Seven DMOs showed no or only small effects. For 12 DMOs, at least three anchor-based estimates were available, enabling MID triangulation. MIDs for walking activity DMOs per day were: a walking duration of 10 minutes, a step count of 1,000 steps, 50 walking bouts (WBs), and 15 WBs for WBs longer than 10 seconds.
For gait DMOs, depending on the walking bout length, MIDs for walking speed were between 0.04 m/s and 0.08 m/s, and MIDs for cadence between 4 and 6 steps/minute. Almost all DMOs showed a strong ability to detect improvement in mobility, but rarely to detect decline. Conclusions: For the first time, MIDs are presented for real-world DMOs in PFF patients. These MIDs inform sample size requirements and interpretation of intervention effects for clinical trials, thereby providing guidance and reassurance for clinicians and regulatory bodies.
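
The effect size named in the Methods can be illustrated with a small sketch. Several definitions of the standardized mean change exist, and the abstract does not say which denominator was used; the version below divides the mean paired change by the standard deviation of the changes (often called the standardized response mean), which is an assumption. The function name and the example data are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def standardized_mean_change(baseline: Sequence[float], follow_up: Sequence[float]) -> float:
    """Mean paired change divided by the sample SD of the changes.

    Assumption: the SD of the change scores is used as the standardizer.
    """
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired and of equal length")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    if len(changes) < 2:
        raise ValueError("at least two paired observations are required")
    spread = stdev(changes)
    if spread == 0:
        raise ValueError("all changes are identical; the effect size is undefined")
    return mean(changes) / spread


if __name__ == "__main__":
    # Hypothetical daily step counts at the two assessments.
    steps_t1 = [3200.0, 4100.0, 2800.0, 5000.0, 3600.0]
    steps_t2 = [4300.0, 4900.0, 3100.0, 6400.0, 4000.0]
    print(round(standardized_mean_change(steps_t1, steps_t2), 2))
```

A positive value indicates improvement between the two assessments when higher scores are better.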

13

PRAM: Post-hoc Retrieval Augmentation for Parameter-Free Domain Adaptation of ICU Clinical Prediction Models

Jeong, I.; Lee, T.; Kim, B.; Park, J.-H.; Kim, Y.; Lee, H.

2026-04-05 health systems and quality improvement 10.64898/2026.04.03.26350132 medRxiv

Top 0.1%

4.0%

Background: Clinical prediction models degrade when deployed across hospitals, yet retraining requires technical expertise, labeled data, and regulatory re-approval. We investigated whether post-hoc retrieval augmentation of a frozen model's output, analogous to retrieval-augmented methods in natural language processing, can mitigate this degradation without any parameter modification. Methods: We developed the Post-hoc Retrieval Augmentation Module (PRAM), which combines predictions from a frozen base model with outcome information retrieved from similar patients in a local patient bank. Five base models (logistic regression through CatBoost) and three retrieval strategies were evaluated on 116,010 ICU patients across three databases (MIMIC-IV, MIMIC-III, eICU-CRD) for acute kidney injury (AKI) and mortality prediction. A bank size deployment simulation modeled performance from zero to full local data accumulation, complemented by source bank cold start, stress tests, and calibration experiments. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC). Results: Retrieval benefit was inversely associated with base model complexity (ρ = -0.90 for AKI, -1.00 for mortality): simpler models benefited more, consistent with retrieval capturing residual signal unexploited by the base model. PRAM showed a statistically significant monotone dose-response between bank size and prediction performance across all six outcome-target combinations (Kendall τ trend test, q = 0.031 for all). At the pre-specified primary comparison (bank = 5,000), the improvement was confirmed for the two largest-shift settings (eICU-CRD AKI: ΔAUROC = +0.012, q < 0.001; eICU-CRD mortality: ΔAUROC = +0.026, q < 0.001). Pre-loading a source bank bridged the cold-start gap, providing an immediate performance gain equivalent to approximately 2,000 to 5,000 local patients.
Conclusions: PRAM provides a parameter-free adaptation mechanism that requires no model retraining, gradient computation, or regulatory re-evaluation at the deployment site. Effect sizes were modest and did not reach cross-model superiority, but the consistent dose-response pattern and the absence of retraining requirements establish retrieval-based adaptation as a viable approach for clinical model transportability. The retrieval mechanism additionally opens a pathway toward case-based interpretability, where predictions are accompanied by identifiable similar patients from the deploying institution.
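
The general idea of combining a frozen model's output with outcomes retrieved from similar local patients can be sketched in a few lines. This is a hypothetical illustration of retrieval augmentation, not the PRAM algorithm: the Euclidean nearest-neighbour retrieval, the fixed blending weight `alpha`, the value of `k`, and the fallback to the base prediction when the bank is empty are all assumptions, and the abstract's three retrieval strategies are not specified here.

```python
import math
from typing import Sequence, Tuple

# One bank entry: (feature vector, observed binary outcome). Hypothetical layout.
PatientRecord = Tuple[Sequence[float], int]


def retrieval_augmented_probability(
    base_probability: float,
    query_features: Sequence[float],
    patient_bank: Sequence[PatientRecord],
    k: int = 5,
    alpha: float = 0.5,
) -> float:
    """Blend a frozen model's probability with the outcome rate of the k nearest patients.

    The base model is never modified; only its output is adjusted post hoc.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not patient_bank:
        # Cold start: with no local patients, return the frozen model's output.
        return base_probability
    nearest = sorted(
        patient_bank, key=lambda record: math.dist(query_features, record[0])
    )[:k]
    retrieved_rate = sum(outcome for _, outcome in nearest) / len(nearest)
    return (1.0 - alpha) * base_probability + alpha * retrieved_rate


if __name__ == "__main__":
    bank = [((0.0, 0.0), 0), ((1.0, 1.0), 1), ((1.1, 0.9), 1), ((5.0, 5.0), 0)]
    print(retrieval_augmented_probability(0.4, (1.0, 1.0), bank, k=2))
```

Because the adjustment uses only stored features and outcomes, the bank can grow at the deployment site without touching model parameters, which matches the parameter-free framing in the abstract.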

14

Endoleak Prediction After EVAR: A Point Cloud Neural Network Framework Enhanced by Computational Fluid Dynamics and Multi-Features

Peng, C.; Zhang, Y.; Guo, W.; Zou, L.; Dong, Z.; Jiang, J.; He, W.

2026-01-30 cardiovascular medicine 10.64898/2026.01.27.26345009 medRxiv

Top 0.1%

4.0%

Background: Endovascular aortic aneurysm repair (EVAR) is effective in preventing rupture of abdominal aortic aneurysm (AAA), but endoleak remains a serious postoperative complication. Accurate prediction of endoleak risk is a significant clinical challenge. Purpose: This study aimed to evaluate the value of a Point Cloud Neural Network (PCNN) in predicting endoleaks after EVAR by integrating multimodal features. Materials and Methods: We collected follow-up data from 381 AAA patients. Radiomic characteristics of the procedural intraluminal thrombus and morphological parameters were extracted following medical image segmentation and 3D reconstruction. Hemodynamic parameters, including time-averaged wall shear stress (TAWSS), oscillatory shear index (OSI), and relative residence time (RRT), were obtained through a semi-automated computational fluid dynamics (CFD) workflow. Six traditional machine learning models and four PCNN architectures were developed with progressively added feature sets: 1) medical history and morphology (H+M); 2) H+M+R; 3) H+M+CFD; and 4) all features combined (H+M+R+CFD). Results: Traditional ML models showed limited performance (AUC range: 0.55-0.77). In contrast, PCNN models demonstrated substantially improved predictive capability. The baseline PCNN (H+M) achieved an AUC of 0.81. The RA-PCNN model incorporating radiomic features showed a 6.58% improvement (AUC=0.86). The CFD-PCNN model with hemodynamic parameters exhibited a 13.0% increase (AUC=0.91), with superior F1-score (0.78) and recall (0.88). The multimodal RA-CFD-PCNN model performed best, achieving an AUC of 0.93, accuracy of 0.90, and F1-score of 0.83. Conclusion: This study establishes a PCNN-based framework for endoleak prediction that significantly outperforms traditional machine learning methods, providing an effective approach for assessing endoleaks in AAA patients.
Summary statement: This study developed a PCNN-based framework integrating clinical, morphological, radiomic, and hemodynamic features from 381 AAA patients to predict endoleaks after EVAR. Results demonstrated superior performance over traditional ML, with hemodynamic parameters providing a major performance boost, highlighting the value of physiological and biomechanical feature integration for vascular disease prediction. Key Results: The multimodal PCNN model integrating all features achieved an AUC of 0.93, significantly outperforming traditional machine learning models (AUCs 0.55-0.77). Incorporating hemodynamic parameters provided the greatest performance increase, with the CFD-PCNN model's AUC increasing by 13.0% to 0.91 compared to the baseline PCNN (AUC=0.81). The model combining radiomics and hemodynamics (RA-CFD-PCNN) achieved the highest F1-score of 0.83 and AUC of 0.93, demonstrating robust predictive accuracy.

15

HybridNet-XR: Efficient Teacher-Free Self-Supervised Learning for Autonomous Medical Diagnostic Systems in Resource-Constrained Environments.

Mayala, S.; Mzurikwao, D.; Suluba, E.

2026-03-19 health informatics 10.64898/2026.03.16.26348570 medRxiv

Top 0.1%

3.9%

Deep learning model classification on large datasets is often limited in countries with restricted computational resources. While transfer learning can offset these limitations, standard architectures often maintain a high memory footprint. This study introduces HybridNet-XR, a memory-efficient and computationally lightweight hybrid convolutional neural network (CNN) designed to bridge the domain gap in medical radiography using autonomous self-supervised learning protocols. The HybridNet-XR architecture integrates depthwise separable convolutions for parameter reduction, residual connections for gradient stability, and aggressive early downsampling to minimize the video RAM (VRAM) footprint. We evaluated several training paradigms, including teacher-free self-supervised learning (SSL-SimCLR), teacher-led knowledge distillation (KD), and domain-gap (DG) adaptation. Each variant was pre-trained on ImageNet-1k subsets and fine-tuned on the ChestX6 multi-class dataset. Model interpretability was validated through gradient-weighted class activation mapping (Grad-CAM). The performance frontier analysis identified the HybridNet-XR-150-PW (Pre-warmed) as the optimal configuration, achieving a 93.38% average accuracy and 99% AUC while utilizing only 814.80 MB of VRAM. Regarding class-wise accuracy, this variant significantly outperformed standard MobileNetV2 and teacher-led models in critical diagnostic categories, notably Covid-19 (97.98%) and Emphysema (96.80%). Grad-CAM visualizations confirmed that the teacher-free pre-warming phase allows the model to develop sharper, anatomically grounded focus on pathological landmarks compared to distilled models. Specialized pre-warming schedules offer a viable, computationally autonomous alternative to knowledge distillation for medical imaging. 
By eliminating the requirement for high-performance teacher models, HybridNet-XR provides a robust and trustworthy diagnostic foundation suitable for clinical deployment in resource-constrained environments. Author summary: Traditional deep learning models for medical imaging are often too large for the low-power computers available in many global health settings. We developed a new model to bridge this computational gap. We designed HybridNet-XR, a highly efficient AI architecture, and trained it using a "teacher-free" method that doesn't require a massive supercomputer. We found a specific version (H-XR150-PW) that provides high accuracy while using very little memory. Our results show that high-performance diagnostic AI can be deployed on standard, low-cost hardware. Furthermore, using visual heatmaps (Grad-CAM), we proved that the AI correctly identifies medical landmarks like lung opacities, ensuring it is safe and reliable for real-world clinical use.
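
The abstract above credits depthwise separable convolutions with parameter reduction. As a rough illustration of why that holds in general, the sketch below compares weight counts for a standard convolution and a depthwise separable one. The layer sizes (64 input channels, 128 output channels, 3x3 kernel) are illustrative assumptions and are not taken from HybridNet-XR; bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(std, sep, round(sep / std, 3))  # 73728 8768 0.119
```

For this assumed layer the separable form needs about 12% of the weights, matching the general ratio 1/c_out + 1/k^2.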

16

Performance Assessment of ECG Delineators on Single-Lead Wearable Ambulatory Data

Chuma, A. T.; Youssef, A. S.; Asmare, M. H.; Wang, C.; Kassie, D. M.; Voigt, J.-U.; Vanrumste, B.

2026-03-26 cardiovascular medicine 10.64898/2026.03.24.26349185 medRxiv

Top 0.1%

3.6%

Reliable interpretation of electrocardiograms (ECGs) requires precise identification of P, QRS, and T (PQRST) wave boundaries. However, it remains challenging due to noise, signal quality variability, and inherent morphological diversity, particularly in recordings from children. This study systematically compares the performance of leading deep neural networks (DNNs) and heuristic-based delineation algorithms on ambulatory single-lead ECG signals, focusing on temporal accuracy. Experiments were conducted using the publicly available LUDB dataset and a private validation dataset comprising 21,759 annotated single-lead wave segments from 611 children recorded using the KardiaMobile ECG sensor. The DNNs were first trained on the LUDB dataset and subsequently tested on the validation dataset. The delineation performance was assessed using sensitivity (Se) and positive predictive value (P+) metrics. The best-performing heuristic-based and DNN models reached Se and P+ of (98.9% vs 97.9%) for P, (99.8% vs 99.2%) for QRS, and (98.7% vs 95.9%) for T wave fiducials, respectively. The lowest standard deviation (in ms) of wave onset/offset delineation was achieved by the attention-based 1D U-Net model: ±16.6/±16.3 for P-wave, ±14.0/±16.3 for QRS, and ±26.3/±18.8 for T-wave, respectively. The findings indicate that optimized heuristic models can perform comparably to complex DNNs, highlighting their efficiency and suitability for real-time ECG delineation in digital health monitoring applications.
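
The Se and P+ metrics named above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names, the greedy nearest-match rule, and the 150 ms tolerance window are all assumptions, since the abstract does not state how detections were matched to annotations.

```python
def match_fiducials(reference_ms, detected_ms, tolerance_ms=150.0):
    """Greedily pair each reference fiducial with its closest unused detection.

    Returns (tp, fp, fn, errors), where errors holds detected minus reference
    times in ms for matched pairs. The tolerance is an assumed value.
    """
    unmatched = sorted(detected_ms)
    errors = []
    for ref in sorted(reference_ms):
        candidates = [d for d in unmatched if abs(d - ref) <= tolerance_ms]
        if candidates:
            best = min(candidates, key=lambda d: abs(d - ref))
            unmatched.remove(best)
            errors.append(best - ref)
    tp = len(errors)
    fn = len(reference_ms) - tp   # annotated fiducials with no detection
    fp = len(unmatched)           # detections with no annotated fiducial
    return tp, fp, fn, errors


def sensitivity(tp, fn):
    """Se = TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def positive_predictive_value(tp, fp):
    """P+ = TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Made-up onset times in ms, for illustration only.
reference = [100, 900, 1700, 2500]
detected = [110, 895, 2100, 2490]
tp, fp, fn, errors = match_fiducials(reference, detected)
print(tp, fp, fn, errors)                                       # 3 1 1 [10, -5, -10]
print(sensitivity(tp, fn), positive_predictive_value(tp, fp))   # 0.75 0.75
```

The spread of `errors` across matched pairs is what a standard deviation in ms, as reported in the abstract, would summarize.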

17

A Bayesian Framework for Physiologically-Based Modeling of Flutter-Induced Aneurysm Progression

Bhattacharyya, K.

2026-02-11 cardiovascular medicine 10.64898/2026.02.09.26345810 medRxiv

Top 0.1%

3.6%

Current clinical risk stratification for thoracic aortic aneurysms (TAA) relies primarily on maximum diameter, which is a poor predictor of rupture. Recent fluid-structure interaction studies have identified a dimensionless "flutter instability parameter" (N_ω) that accurately classifies abnormal aortic growth. However, this parameter currently serves as a static diagnostic snapshot. In this work, we propose a proof-of-concept computational framework that links flutter instability to microstructural tissue damage via a coupled system of ordinary differential equations (ODEs). We model a feedback loop where flutter-induced energy dissipation drives elastin degradation and collagen remodeling, which in turn reduces wall stiffness and amplifies the instability. To address the challenge of unobservable tissue properties, we implement a Bayesian inference engine to infer model parameters. We demonstrate feasibility on a synthetic patient cohort calibrated to published clinical growth rates and diameters. Our results show that this approach can infer hidden damage parameters and capture the qualitative bifurcation between stabilizing remodeling and runaway aneurysm expansion. While validation on real patient data remains essential, this work establishes the mathematical foundation for transforming a static physiomarker into a personalized prognostic trajectory.

18

CardioPulmoNet: Modeling Cardiopulmonary Dynamics for Histopathological Diagnosis

Pham, T. D.

2026-02-20 health informatics 10.64898/2026.02.19.26346620 medRxiv

Top 0.1%

3.6%

Objective: This study investigates whether incorporating physiological coupling concepts into neural network design can support stable and interpretable feature learning for histopathological image classification under limited data conditions. Methods: A physiologically inspired architecture, termed CardioPulmoNet, is introduced to model interacting feature streams analogous to pulmonary ventilation and cardiac perfusion. Local and global tissue features are integrated through bidirectional multi-head attention, while a homeostatic regularization term encourages balanced information exchange between streams. The model was evaluated on three histopathological datasets involving oral squamous cell carcinoma, oral submucous fibrosis, and heart failure. In addition to end-to-end training, learned representations were assessed using linear support vector machines to examine feature separability. Results: CardioPulmoNet achieved performance comparable to several pretrained convolutional neural networks across the evaluated datasets. When combined with a linear classifier, improved classification performance and higher area under the receiver operating characteristic curve were observed, suggesting that the learned feature embeddings are well structured for downstream discrimination. Conclusion: These results indicate that physiologically motivated architectural constraints may contribute to stable and discriminative representation learning in computational pathology, particularly when training data are limited. The proposed framework provides a step toward integrating physiological modeling principles into medical image analysis and may support future development of transferable and interpretable learning systems for histopathological diagnosis.

19

Explainable Advanced Electrocardiography Heart Age Shows Good Reproducibility in Healthy Young Adults

Warrington, C. R.; Al-Falahi, Z.; Premawardhana, U.; Ugander, M.; Green, S.

2026-03-25 cardiovascular medicine 10.64898/2026.03.24.26349147 medRxiv

Top 0.1%

3.6%

Aims: Explainable advanced electrocardiography (A-ECG) can be used to estimate heart age from the standard 12-lead ECG. A-ECG heart age gap (HAG) represents the difference between A-ECG heart age and chronological age. Increased A-ECG HAG is associated with cardiovascular outcomes and can be used to communicate risk. The aim was to investigate whether A-ECG heart age demonstrates acceptable within- and between-session reproducibility. Methods: Healthy adults (n=42, age 23+/-4 years, 52% male) attended up to two sessions ~14 days apart, with 36 participants completing both sessions. During each session, five standard resting 12-lead ECGs were obtained while lying in the supine position with unchanged electrode positions. A-ECG heart age was extracted using dedicated software. Within-session reproducibility was assessed using all five recorded ECGs with coefficient of variation (CV) and a two-way random effects intraclass correlation coefficient (ICC). Between-session reproducibility was assessed using the first recorded ECG of each session with a paired t-test, CV and ICC. A further analysis assessed the reproducibility of the parameters used in the A-ECG heart age regression model. Results: A-ECG heart age showed excellent within-session reproducibility in session one and two (both CV 5.8%, ICC 0.99). A-ECG heart age was slightly lower in session one than two (24.0+/-7.5 vs. 25.5+/-7.8 years, p=0.04) and showed good between-session reproducibility (CV 8.3%, ICC 0.84). All but one parameter used to estimate A-ECG heart age showed acceptable within- and between-session reproducibility (CV<10%). Conclusion: A-ECG heart age demonstrates excellent within-session reproducibility and good between-session reproducibility in healthy young adults.
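
The coefficient of variation used above as a reproducibility measure can be sketched in a few lines. The abstract does not say how CV was aggregated across participants, so the per-participant averaging below is an assumption, as are the function names and the heart-age readings, which are made up for illustration. The ICC is not covered here.

```python
from statistics import mean, stdev


def cv_percent(readings):
    """Coefficient of variation of repeated readings: sample SD / mean, in percent."""
    return 100.0 * stdev(readings) / mean(readings)


def mean_within_subject_cv(readings_by_participant):
    """Average the per-participant CV (an assumed aggregation rule)."""
    return mean(cv_percent(r) for r in readings_by_participant)


# Made-up repeated heart-age estimates (years) for two hypothetical participants.
session = [
    [24.0, 25.0, 26.0],   # mean 25, sample SD 1 -> CV 4%
    [30.0, 30.0, 30.0],   # identical readings   -> CV 0%
]
print(cv_percent(session[0]))            # 4.0
print(mean_within_subject_cv(session))   # 2.0
```

A lower CV means repeated recordings from the same person agree more closely relative to their mean.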

20

Design of a Secure Wearable Health Data Sharing Platform for Region Hovedstaden: A FHIR DK and GDPR-Compliant Service Architecture

Chowdhury, A.; Irtiza, A.

2026-03-13 health systems and quality improvement 10.64898/2026.03.12.26348210 medRxiv

Top 0.1%

3.1%

The 1.8 million residents of Region Hovedstaden (Denmark's Capital Region) currently lack a secure, standardized pathway for integrating continuous wearable health data into Sundhed.dk, the national electronic health record. Consumer wearables such as Apple Watch, Oura Ring, and Garmin generate longitudinal physiological data relevant to chronic disease management, yet existing workflows rely on manual, non-standardized exports incompatible with FHIR DK v6.0.2 profiles and GDPR Article 25 privacy-by-design requirements. This paper presents a conceptual five-layer microservice architecture for secure wearable data sharing, employing MitID national authentication, National Service Infrastructure (NSI) integration, and Zero Trust security controls. Requirements were derived from a mixed-methods study including surveys of 47 Danish stakeholders and systematic benchmarking of existing platforms. Results show 51.1% conditional willingness to share wearable data under secure conditions, with audit transparency and non-medical misuse identified as central trust factors. Fourteen MoSCoW-prioritized requirements (F1-F7, NF1-NF7) are mapped to architecture components, providing a traceable blueprint for closing the interoperability gap in Danish public healthcare.